Skip to content

Coalesce function unsubscribes only after receiving 3 samples#1374

Open
simonvoelcker wants to merge 3 commits intofrequenz-floss:v1.x.xfrom
simonvoelcker:coalesce_cautious_unsubscribe
Open

Coalesce function unsubscribes only after receiving 3 samples#1374
simonvoelcker wants to merge 3 commits intofrequenz-floss:v1.x.xfrom
simonvoelcker:coalesce_cautious_unsubscribe

Conversation

@simonvoelcker
Copy link
Contributor

@simonvoelcker simonvoelcker commented Mar 13, 2026

Addresses #1332.

Background

The Coalesce function subscribes and unsubscribes from components based on data availability. If it receives None-values, it will subscribe to the next component, if any, from its parameter list. Once the other components start sending data again, we may unsubscribe from "later" components. We currently unsubscribe from all components "after" a given one once that given one sends us one non-None sample. This is undesired because we may receive intermittent None values for various reasons and would then unsubscribe and resubscribe excessively.

Change

The solution is to count non-None samples from earlier streams and unsubscribe only once a threshold (3) is reached.

Discussion

If there are only two components (main, backup), it is quite straightforward. With more components, although probably rarely seen in practice, one could come up with a more sophisticated solution that tracks each components samples and marks them as stable when 3 samples are reached. Happy to discuss, but I think it's overkill.

@github-actions github-actions bot added part:tests Affects the unit, integration and performance (benchmarks) tests part:data-pipeline Affects the data pipeline labels Mar 13, 2026
@simonvoelcker simonvoelcker force-pushed the coalesce_cautious_unsubscribe branch 2 times, most recently from dae4a2a to c69f4a4 Compare March 13, 2026 14:24
@github-actions github-actions bot added the part:docs Affects the documentation label Mar 13, 2026
@simonvoelcker simonvoelcker force-pushed the coalesce_cautious_unsubscribe branch 2 times, most recently from 89eb72f to 6a483f0 Compare March 13, 2026 14:32
@simonvoelcker simonvoelcker marked this pull request as ready for review March 13, 2026 14:52
@simonvoelcker simonvoelcker requested a review from a team as a code owner March 13, 2026 14:52
@simonvoelcker simonvoelcker requested review from shsms and removed request for a team March 13, 2026 14:52
@simonvoelcker simonvoelcker self-assigned this Mar 16, 2026
@simonvoelcker simonvoelcker enabled auto-merge March 17, 2026 10:00
Copy link
Contributor

@llucax llucax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general, all comments are minor enough to be optional for me, but I would still leave the final approval to @shsms as I'm not very familiar with formulas code.

"""
self.num_samples += 1

if self.num_samples >= 3:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make this number a constant, like MINIMUM_STABLE_SAMPLES (there is probably a better name).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not extract because this is the kind of very specific thing for which there is no obvious name, and adding one for the sake of it often makes it more confusing. If you have a good name and think that it is helpful to use that, please share.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is worth extracting even if the name is not great to give it more visibility, as magic numbers buried deep in the code might be easy to miss otherwise. I would ask AI for a name to see if they can do better than us...

)


class TestCoalesceFunction:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you split out some tests into their own file, I wonder what's the relationship between formulas and functions. Wouldn't it make also sense to separate test for coalesce function to its own file, or make these tests part of the existing TestFormulas? Sorry if the question is silly, I'm not familiar with this code (so I didn't follow all the test logic) to assess it.

Copy link
Contributor Author

@simonvoelcker simonvoelcker Mar 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I split out the validation tests because (a) the file went over the 1000 line limit and (b) the validation tests were the most obvious candidate to move to their own file. What is left is indeed a bit of a mix of tests for formula evaluation (includes functions used in formulas) and tests for the subscribe/unsubscribe behavior of the coalesce function specifically. If would probably make sense to have test_formula_evaluation.py for the former, but then, what is the other one? test_formulas.py is the lowest common abstraction, IMO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can keep the name for general tests without a better grouping. For example

I also just noticed these tests are under a _formulas directory, so we can also either keep test_formula.py or use something like test_general.py, and I would remove the prefix from other files, like _formulas/test_evaluation.py, _formulas/test_validation.py, again, to keep the resulting test names as short as possible (and not redundant).

@simonvoelcker simonvoelcker force-pushed the coalesce_cautious_unsubscribe branch from 6a483f0 to 3c5bb91 Compare March 17, 2026 12:48
@simonvoelcker simonvoelcker added the cmd:skip-release-notes It is not necessary to update release notes for this PR label Mar 17, 2026
Signed-off-by: Simon Völcker <simon.voelcker@frequenz.com>
Signed-off-by: Simon Völcker <simon.voelcker@frequenz.com>
The statements removed here would remove component data subscriptions for components that come after a constant in Coalesce's parameter list. However, such subscriptions are never created. We only add subscriptions, one at a time, until data is available, which is always the case when we reach a constant in the parameter list.

Signed-off-by: Simon Völcker <simon.voelcker@frequenz.com>
@simonvoelcker simonvoelcker force-pushed the coalesce_cautious_unsubscribe branch from 3c5bb91 to 769be61 Compare March 17, 2026 13:03
@shsms
Copy link
Contributor

shsms commented Mar 17, 2026

Sorry, I'll take a look in the morning.

Comment on lines +1023 to +1035
# After 3 samples, the last subscription is dropped.
self.CoalesceSample(
values=[None, 12.0, 15.0],
expected_subscriptions=[True, True, True],
),
self.CoalesceSample(
values=[10.0, None, 15.0],
expected_subscriptions=[True, True, True],
),
self.CoalesceSample(
values=[None, 12.0, 15.0],
expected_subscriptions=[True, True, False],
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not be enough. Could you please also track the earliest_non_nil_param, for which the num_samples is valid? If earliest non-nil param changes, then you reset the num_samples.

Then we'll know that we have at least one stable stream, before we unsubscribe. Then you don't have to remove _unsubscribe_all_params_after either.

self.CoalesceSample(
values=[None, 12.0, 15.0],
expected_subscriptions=[True, True, False],
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add a test that cycles between sub and unsub a couple of times, with some stable period, at least some with more than 3 stable samples, before they change again? I think this is complicated logic, and now that we depend on 3 samples for each state change, the tests need a lot more samples.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cmd:skip-release-notes It is not necessary to update release notes for this PR part:data-pipeline Affects the data pipeline part:docs Affects the documentation part:tests Affects the unit, integration and performance (benchmarks) tests

Projects

Status: To do

Development

Successfully merging this pull request may close these issues.

3 participants